Tutorial Part II — R goes Spatial
## this might be useful if rtools is not yet installed manually
#install.packages("installr")
#installr::install.rtools()Let me start with a big thanks to Cedric Scherer, Moritz Wenzler, Pierre Gras and Juergen Niedballa who constantly helped to improve the course since its start in 2014! I am deeply indebted to you!
Introduction
The course is intended to be a quick introduction to using R as a
GIS. It builds on the previous course, Course1_RIntro, and is written in
such a way that you can understand handling spatial data when knowing
the basics of handling data.frames, matrices and lists. There is no need
to know tidyverse/dplyr coding style.
This course provides the basics for using the most important spatial data types — vector data (points, lines, polygons, aka ESRI shapefiles) and raster data.
Recently, {sf} objects have been developed to handle
vector data more easily. Those objects can be treated like simple
data.frames. For raster data there are fast and modern
packages — namely {terra} and {stars}. The
{terra} package is also especially useful for remote
sensing data; it has a single SpatRaster class. The course
is updated to use the {terra} package.
In addition to different data types, the course provides sections on coordinate projections and the most important geo-spatial operations. Last but not least - we plot a lot of maps to ease data visualization and because it is fun.
The data we use stem from a longstanding project we run in Borneo. Please refer to our website https://ecodynizw.github.io/ for further information. The project involves species conservation and large scale landscape planning. To learn more about the data used in the course and the project, please refer to the following publications that are freely available:
Targeted conservation to safeguard a biodiversity hotspot from climate and land-cover change. MJ Struebig, et al. 2015. Current Biology 25 (3), 372-378. https://doi.org/10.1016/j.cub.2014.11.067
Anticipated climate and land‐cover changes reveal refuge areas for Borneo’s orang-utans. MJ Struebig, et al. 2015. Global Change Biology 21 (8), 2891-2904. https://doi.org/10.1111/gcb.12814
The importance of correcting for sampling bias in MaxEnt species distribution models. S Kramer‐Schadt et al. 2013. Diversity and Distributions 19 (11), 1366-1379. https://doi.org/10.1111/ddi.12096
The Borneo carnivore database and the application of predictive distribution modelling. S Kramer-Schadt, et al.2016. Raffles Bulletin of Zoology, Supplement No. 33: 18–41. https://lkcnhm.nus.edu.sg/app/uploads/2017/06/S33rbz018-041-1.pdf
Useful (web)sites
1. R news and tutorials
2. Quick introduction to spatial R
- https://rspatial.org/
- http://pakillo.github.io/R-GIS-tutorial/
- http://rstudio-pubs-static.s3.amazonaws.com/7993_6b081819ba184047802a508a7f3187cb.html
3. Spatial visualisation
- check out the packages
{tmap},{ggmap},{leaflet}and{cartography}
- https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html
- A whole book: https://clauswilke.com/dataviz/ with spatial data here: https://clauswilke.com/dataviz/geospatial-data.html
4. Overview of spatial packages
- http://cran.r-project.org/web/views/Spatial.html =
http://cran.r-project.org/view=Spatial
- https://www.r-spatial.org/
5. R-cheatsheets — great for learning
‘vocabulary’
* e.g. here: https://www.rstudio.com/resources/cheatsheets/
6. Infos about simple features in R ({sf}
package)
- https://r-spatial.github.io/sf/index.html
- and nice intro: https://oliviergimenez.github.io/introspatialR/#1
- geocomputation: https://geocompr.robinlovelace.net/spatial-operations.html
7. Lots of EPSG Codes for map projections
* Use EPSG codes as unique and specific identifies of your coordinate
reference systems instead of writing projection details.
* for EPSG codes see http://spatialreference.org/ or https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pd
8. BioMove Graduate School specials:
- For analysis of plot-based biodiversity data: package
{vegan}
- For analysis of telemetry data: packages
{adehabitatHR},{adehabitatLT},{move},{recurse},{momentuHMM},{moveHMM},{ctmm}
Further reading
1. Check Roger Bivand’s publications:
* http://cran.r-project.org/web/views/Spatial.html
* http://www.csiss.org/learning_resources/index.html
* Roger Bivand et al. 2008, Applied Spatial Data Analysis in R,
Springer
2. Check Edzer Pebesma’s course materials and
publications:
* http://ifgi.uni-muenster.de/~epebe_01/
* https://edzer.github.io/UseR2017/
* https://www.rstudio.com/resources/videos/tidy-spatial-data-analysis/
3. Geocomputation with R: a open-access book of Robin
Lovelace, Jakub Nowosad & Jannes Muenchow
* https://geocompr.robinlovelace.net
4. DataVizArt Cedric Scherer -> check out his #TidyTuesday
or #30DaysMapChallenge contributions
* https://www.cedricscherer.com/top/dataviz/
* Codes: https://github.com/z3tt/TidyTuesday and https://github.com/z3tt/30DayMapChallenge
Basics
How to install…
pkgs <- c("terra", "stars", "sf", "sp", "rasterVis",
"ggplot2", "tmap", "viridis", "patchwork", "here", "units",
"devtools", "osmdata", "elevatr","tanaka","jpeg")
## install packages that are not installed yet
## (not important to understand the following code, just run it)
unavailable <- setdiff(pkgs, rownames(installed.packages()))
install.packages(unavailable)
## -------- Please run the following ---------------------
## install development version of rnaturalearth as currently the
## download doesn't work in the CRAN package version
#devtools::install_github("ropensci/rnaturalearth")
## install rgeoboundaries from GitHub (not available on CRAN yet)
#devtools::install_github("wmgeolab/rgeoboundaries", force = TRUE)…and load packages
## when working with spatial data
## do NOT load both packages {terra} and the old {raster} at the same time,
## as this creates problems with the namespace!!!
library(terra)
library(rasterVis)
## working with vector data
library(sf) ## the "new" {sp} package
library(sp) ## the old one. Still has some nice functionalities
library(stars)
## visualization
library(ggplot2)
library(tmap)
library(viridis) ## nice colour palettes
library(patchwork) ## to combine plots
library(units) ## handle measurement data
library(here) ## for easy directory management
## depending on the sequence of installation
## you might need to reinstall this. But wait for an error message.
#install.packages("Rcpp", repos="https://rcppcore.github.io/drat")How to call help
Remember to use ‘?’ or ‘??’ if you want information about a function of package. If you do not understand some of the code below, always check the arguments and help of the function. Vignettes are helpful code demonstrations with examples.
## Information about a function or a package.
?terra
## search for an item:
??SpatialPolygons
## show the instructions of a package:
vignette(package = "sf") # vignette(package="sp")
## load the instruction of a package (embedded pdf):
vignette("sf1")
## For a better version of the sf vignettes see
# https://r-spatial.github.io/sf/articles/Using the namespace
Important is setting the namespace. Sometimes, packages use the same name for a function that does different things (like plot), so it is helpful to set the namespace with a :: . That is, provide the library/ package from which you want to use the function with ‘package::function’.
Set the workspace
If you have downloaded the repository for the course as described
here (https://ecodynizw.github.io/teaching.html), you
automatically have adopted our folder structure. Nothing more to do!
Continue with chapter ‘R-projects and package
{here}’.
Click here for optional - use own course folder setup and R-project
Optional - setting workspace by hand and learn about R-projects
If you have not downloaded the course folder structure with the
R-project, you can also do it ‘manually’: Set your root directory
setwd(). You could name it ‘d6_teaching_collection’.
Create the subfolders relative to root-folder. Please adjust to your setup. A possible folder structure could be:
.
└── <your folder name> ─ root folder , e.g. d6_teaching_collection
├── data ─ data
│ └── data_borneo ─ e.g., the Borneo data
│ ├── geo_raster_current_asc ─ geo data, raster ascii format, as in data_borneo
│ └── animal_data
└── R ─ store here all your scripts, i.e.
├── my_script_course1.R
└── my_script_course2.RTraditionally, the working directory was ‘hard coded’, i.e. the full
path was specified with setting the work directory by
setwd().
## adjust the working directory [wd]
## getwd() ## check where you are
##
## ## Set your OWN root directory, the following is just an example of how to do it:
## #root_wd <- setwd("C:/Users/kramer/d6_teaching_collection")This approach is error-prone and it complicates the cooperation with others as they have to update the working directory in every script they receive from you. (And if they don’t change it back to your directory, you also have to update it again…)
Nowadays, the best approach is to work with R projects that are associated with own working directory, workspace, history, and source documents.
You can create a project in the RStudio IDE by using the
Create Project command (available on the Projects menu and
on the global toolbar). In our case, we want to create a project in an
existing directory where there is already R code and data placed inside
that given folder. Now, a hidden folder called .Rproj.user
and most importantly a file called your_project.Rproj
should appear in your folder. Every time you want to work on your
project, you open the Rstudio session by double-clicking on that
.Rproj file, which ensures that the root working directory
is correctly set.
End of optional.
R-projects and package {here}
When using R projects, the package {here} is a handy
helper:
The goal of the {here} package is to enable easy file
referencing in project-oriented workflows. In contrast to using
setwd(), which is fragile and dependent on the way you
organize your files (see optional section above for details),
{here} uses the top-level (=root) directory of an
R-project to easily build paths to files.
The here() function from the {here} package
searches for the .Rproj file and will allow you to
construct relative paths easily within your project environment:
# tip: use the namespace here:: before calling to function here()
# because here() alone interferes with the package `purrr` if that is loaded
here::here() ## [1] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection"
here::dr_here()
# this is an example, ob how a long folder path can be assigned to a short name (root_wd)
root_wd <- here::here() # the root folder is automatically set
root_wd ## [1] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection"
Now, we assign the path to these folders that contain our data relative to the root directory, and we work with the Borneo data:
.
└── root_wd ─ e.g. d6_teaching_collection
└── dbor_wd ─ data / data_borneo
├── maps_wd ─ geo data
└── anim_wd ─ animal location data## Now, create the linkages to the folders:
# dbor_wd <- paste0(root_wd, "/", "data/data_borneo") #- the old way
dbor_wd <- here("data","data_borneo") #- note the nested folder structure
# maps_wd <- here("data","data_borneo","geo_raster_current_asc") ## same as:
maps_wd <- paste0(dbor_wd, "/geo_raster_current_asc")
# mind difference paste0() and paste() :
# paste() needs a separator sign, in this case no space ''
vecs_wd <- paste0(dbor_wd, "/geo_vector")
anim_wd <- paste(dbor_wd, "/animal_data", sep = '') Create an output folder where you can temporarily store your results (anything you plot, create, want to save,….) with dir.create. If you use our downloaded course repository, you already have this folder.
## [1] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection/output"
Finally, your folder structure will look like this:
.
└── <your folder name> ─ root folder , e.g. d6_teaching_collection
├── data ─ data
│ ├── data_borneo ─ e.g., the Borneo data
│ ├── geo_raster_current_asc ─ geo data, raster ascii format, as in data_borneo
│ └── animal_data
├── R ─ store here all your scripts, i.e.
│ ├── my_script_course1.R
│ └── my_script_course2.R
├── output ─ for any results/ plots,...
└ <your R project name.Rproj> ─ in case you are working with an R-ProjectA small yet important hint on file and folder naming: Do NOT! use dots (NOT: folder.name), spaces, comma, semicolon, or any special signs like ö, ä, ü, ß, ! etc. in your folder names, and keep name length at a maximum of 13 letters!
Access folder content
You will find a lot of bioclimatic raster files (.asc) in the
geo_raster_current_asc folder, which we named
maps_wd above:
## create an object called 'filename' and store the names of the files
filenames <- list.files(path = maps_wd)
## show the first six files
head(x = filenames)## [1] "bio_asc_01.asc" "bio_asc_21.asc" "bio_asc_22.asc" "bio_asc_24.asc"
## [5] "bio_asc_27.asc" "bio_asc_42.asc"
Should this directory be empty, then you have wrongly set the root directory and the data directory. Please check again how your folder structure looks like. By default, this folder should contain six ascii-files. The description what info these maps contain is provided in the DataDescription_Borneo.doc in the data_borneo folder that you downloaded with the repository.
Recap from Course 1 Basics in R
useful functions:
head(): only shows the first 6 elements, i.e. if you have a large data frame, you can quickly check header and the first entries (analogously:tail())paste(x, sep=''): appends characters or numbers — great for working directories, loops across many maps or data.frames etc
data formats and access:
data.frame(): [row, column] or “dollar sign”, i.e.mydata[1,2]→ first row, second column, ormydata$MyColumnName[1]matrix(): see abovelist(): [[element]], i.e.mylist[[1]]. Attention! A list element can consist of a data.frame that you can then access like that:mylist[[1]][1,2]…
S4-objects (like spatial data)
- access via
@, i.e.SpatRaster@data@values
Raster data
Data import
First, load a raster map. Please check the description in the folder called DataDescription_Borneo.doc. The file bio_asc_01.asc is showing mean annual temperature multiplied by 10. Why? Because it can be stored as integer, not floating type (i.e. with digits), which saves a lot of storage and memory.
## ## read the ascii file as raster format
ras_bio_asc_01 <- rast(x = paste0(maps_wd, "/bio_asc_01.asc"))
## or use function 'here' (which in this case is much longer.....)
# ras_bio_asc_01 <- rast(x = here("data", "data_borneo","geo_raster_current_asc", "bio_asc_01.asc"))
## copy the raster map and give it a sensible name
Bor_mat <- ras_bio_asc_01 ## easy copying of whole maps; mat stands for mean annual tempNow have a look at the object:
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
## source : bio_asc_01.asc
## name : bio_asc_01
What does the slot with extent tells you? Looks like coordinates in decimal degrees, i.e. latitude and longitude.
Now have a look at the slot coord. ref. (or
crs. CRS stands for coordinate
reference system): Nicely in the new
{terra}-package, the CRS (map projection) is automatically
set in the following way: If the crs argument is missing
when loading data with rast(x= ..., crs= ...), and the x
coordinate values are within -360 .. 360 and the y coordinate values are
within -90 .. 90, automatically “+proj=longlat +datum=WGS84” is
assigned. WGS84 is the global reference system for geospatial
information and is the reference system for the Global Positioning
System (GPS). In this case, (OGC:CRS84) stands for ‘Open Geospatial
Consortium’ OGC standards to represent the coordinate reference system
WGS84. The parameters for this CRS are now stored in so-called
‘well-known text’ (WKT) files. We will come back to that later.
However, sometimes the slot coord. ref. is empty,
i.e. if other coordinates than decimal degrees are used, e.g. UTM. That
means, without knowing the reference system, the map cannot be plotted
at the exact location on the globe. Thus, when you get a map, always
make sure you also get the information of the coordinate reference
system! If it is not yet set/ assigned (because you import a raster map
as a simple matrix or points from a file), this is the first thing you
should do!
# str(Bor_mat) ## check this
crs(Bor_mat) ## crs = coordinate reference system. If empty, assign it!## [1] "GEOGCRS[\"WGS 84 (CRS84)\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n CS[ellipsoidal,2],\n AXIS[\"geodetic longitude (Lon)\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n AXIS[\"geodetic latitude (Lat)\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"World\"],\n BBOX[-90,-180,90,180]],\n ID[\"OGC\",\"CRS84\"]]"
In this case, it was automatically assigned, so nothing to do for now.
Set the CRS
Important! Please read and repeat basic GIS-knowledge on coordinate reference systems (CRS), map projections, and geographic and projected coordinates. This issue is also covered in the lecture prior to this course for TUB students.
If a map already has a defined CRS, then you cannot simply overwrite
this! To re-calculate the coordinates from one system into another
system (e.g. from angular coordinates of latitude/ longitude of WGS84 to
a projected CRS with planar coordinates and meter distances, equal area
of Lambert Azimuthal), you need to transform it with the
function project()(raster-data) or
st_transform()(vector-data) (see chapters below).
In the following, we assign the CRS only when no information is
provided. This can be the case when you upload a map based on a simple
data.frame or matrix, i.e. when it is not yet
a spatial object.
crs(<raster>) gives the information of the object
(here: raster layer) Bor_mat or
ras_bio_asc_01, and is also used to set the CRS of the
object via assignment:
crs(<raster>) <- <projection>.
crs(<raster>)shows coordinate reference system of a spatial objectcrs(<raster>) <- <projection>creates projection and sets arguments for CRS!
An easy way to specify the CRS is using a unique identifier system,
where all parameters of the projections are stored. A well-known
identifier (WKID) is the EPSG code. E.g. instead of
"+proj=longlat +datum=WGS84" use
"+init=epsg:4326".
See http://spatialreference.org/. The EPSG code for WGS84 is 4326.
In case the CRS was not yet assigned:
# many ways to assign the CRS, options listed below:
crs(ras_bio_asc_01) <- "+proj=longlat +datum=WGS84"
crs(ras_bio_asc_01) <- "+init=epsg:4326"Projections can also be transferred from one object to another, but only when the CRS hasn’t yet been specified and the coordinates are stemming from the same system, i.e. longitude/ latitude.
In the following, load the topographic map (digital elevation model)
of Borneo (Bor_dem.asc or bio_asc_24.asc) and
assign it the same CRS as Bor_mat, because we know they are
overlaying (you can check that with the raster extents):
## ras_bio_asc_24 = DEM = digital elevation model
ras_bio_asc_24 <- rast(x = paste0(maps_wd, "/bio_asc_24.asc"))
crs(ras_bio_asc_24)## [1] "GEOGCRS[\"WGS 84 (CRS84)\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n CS[ellipsoidal,2],\n AXIS[\"geodetic longitude (Lon)\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n AXIS[\"geodetic latitude (Lat)\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"World\"],\n BBOX[-90,-180,90,180]],\n ID[\"OGC\",\"CRS84\"]]"
## if not assigned, it could also be copied:
# crs(ras_bio_asc_24) <- crs(ras_bio_asc_01)
## same as:
# crs(ras_bio_asc_24) <- crs("+proj=longlat +datum=WGS84")Quickly visualize the data with the easy base plot:
(More plotting later!)
Clip areas
We do this first to work with a small raster to save computation
time. For this, we create a clip_extent based on the spatial
coordinates. The command/ function crop() then clips the
raster map.
## SpatExtent : 108.33412288693, 120.19245688433, -4.3740129986359, 7.5009876663641 (xmin, xmax, ymin, ymax)
clip_extent <- ext(117.2, 117.45, 5.4, 5.5) ## create a subset based on long/ lat coordinates
ras_bio_asc_01_cr <- crop(x = ras_bio_asc_01, y = clip_extent)
plot(ras_bio_asc_01_cr, col = viridis::inferno(10))Accessing SpatRaster objects
First, have a look at the internal data structure of the raster object:
## class : SpatRaster
## dimensions : 12, 30, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 117.2008, 117.4508, 5.400988, 5.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
## source(s) : memory
## varname : bio_asc_01
## name : bio_asc_01
## min value : 261
## max value : 268
## get additional information:
# str(ras_bio_asc_01_cr) ## strucutre of object
# attributes(ras_bio_asc_01_cr) ## available slots
# class(ras_bio_asc_01_cr) ## object type in RThere are many ways to retrieve internal data and access single bits of information:
## SpatExtent : 117.20079005013, 117.45079006413, 5.4009875487641, 5.5009875543641 (xmin, xmax, ymin, ymax)
## [1] 117.2008
## [1] 30
## bio_asc_01
## 1 265
## 2 265
## 3 265
## 4 266
## 5 267
## 6 267
## [1] "GEOGCRS[\"WGS 84\",\n ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n MEMBER[\"World Geodetic System 1984 (Transit)\",\n ID[\"EPSG\",1166]],\n MEMBER[\"World Geodetic System 1984 (G730)\",\n ID[\"EPSG\",1152]],\n MEMBER[\"World Geodetic System 1984 (G873)\",\n ID[\"EPSG\",1153]],\n MEMBER[\"World Geodetic System 1984 (G1150)\",\n ID[\"EPSG\",1154]],\n MEMBER[\"World Geodetic System 1984 (G1674)\",\n ID[\"EPSG\",1155]],\n MEMBER[\"World Geodetic System 1984 (G1762)\",\n ID[\"EPSG\",1156]],\n MEMBER[\"World Geodetic System 1984 (G2139)\",\n ID[\"EPSG\",1309]],\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",7030]],\n ENSEMBLEACCURACY[2.0],\n ID[\"EPSG\",6326]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8901]],\n CS[ellipsoidal,2],\n AXIS[\"longitude\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n AXIS[\"latitude\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"World.\"],\n BBOX[-90,-180,90,180]]]"
## crs = coordinate reference system defined
## CRS creates projection and takes args for crs!
## e.g.
wgs84_crs_args <- CRS("+proj=longlat +datum=WGS84")
wgs84_crs_args ## please check! You will see the WKT representation of the parameters## Coordinate Reference System:
## Deprecated Proj.4 representation: +proj=longlat +datum=WGS84 +no_defs
## WKT2 2019 representation:
## GEOGCRS["unknown",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ID["EPSG",6326]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]],
## CS[ellipsoidal,2],
## AXIS["longitude",east,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## AXIS["latitude",north,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]]]
## [1] 12
## [1] 12 30 1
## [1] 0.008333334 0.008333334
Any idea what this kind of strange cell size could be? We will come to that later.
An important function is crds(), which returns the
centroids of each raster cell:
## x y
## [1,] 117.2050 5.496821
## [2,] 117.2133 5.496821
## [3,] 117.2216 5.496821
## [4,] 117.2300 5.496821
## [5,] 117.2383 5.496821
## [6,] 117.2466 5.496821
## [7,] 117.2550 5.496821
## [8,] 117.2633 5.496821
## [9,] 117.2716 5.496821
## [10,] 117.2800 5.496821
Terrain computation
We will now work with the digital elevation model (DEM)
ras_bio_asc_24 and create a hillshade (3D look) based on
topography using slope and aspect: - simulates a 3D surface - computes
shaded relief values for a raster surface
slope <- terrain(x = ras_bio_asc_24, v = "slope", unit = "radians", neighbors = 8)
aspect <- terrain(x = ras_bio_asc_24, v = "aspect", unit = "radians", neighbors = 8)
Bor_hs <- shade(slope, aspect, angle = 45, direction = 270) Let’s plot the hillshade:
Other very useful terrain calculations are cost surfaces, cost distances and least cost path, e.g. for corridor calculations. We will do that in another course.
Visualising rasters
Above we used the basic plot() function, but there are
pretty alternatives available.
Easy to use: the {tmap} package
Static map
If you want to use a static map as high quality plots for your
publication, save the output via tmap_save().
tmap_mode("plot")
(m <- tm_shape(ras_bio_asc_01) +
tm_raster(palette = "viridis", title = "Mean annual temp"))Make sure that the quality is sufficient by setting the dpi (“dots per inch”) to at least 600. We save this map in our output folder.
tmap_save(m, paste0(output_wd, "/BorneoMap_4326_tmap.png"),
units = "mm", width = 90, height = 90, dpi = 600)The saved map: check output folder
Interactive map
Very good for checking the correct location of the data. Great for zooming in/ out or overlaying with open street map data:
Click on the + | - | layer icons on the upper left and zoom in or out and change the background layers by clicking on the layer icon. The ‘alpha’ argument makes the layer transparent.
If you like, you can save the interactive map — check your output folder for the html file.
Tip of the day: always save the epsg code of your crs at the end of a spatial file name.
tmap_mode("view")
(i.m <- tm_shape(ras_bio_asc_01) +
tm_raster(palette = "viridis", title = "Mean annual temp"))tmap_save(i.m, paste0(output_wd,"/BorneoMAT_4326.html"))
## or use here:
# tmap_save(i.m, here("output","BorneoMAT_4326.html"))
Optional - Visualising rasters with the {ggplot2} package
Visualising rasters with the {ggplot2} package
For advanced R-users familiar with ggplot() coding
style: Beautiful plots can also be made with {ggplot2};
however, SpatRaster objects cannot directly be plotted. One
can use the {stars}
package.
The {stars}
package is another package aimed at working with spatial data,
namely spatio-temporal arrays (such as raster and vector data cubes). We
first turn our raster into a stars object using function
st_as_stars and can then use the geom_stars()
function in combination with ggplot:
## [1] "stars"
g <- ggplot() +
geom_stars(data = stars_bio_asc_01) +
scale_fill_viridis_c(name = "°C * 10", na.value = "transparent") +
coord_sf(crs = "+init=epsg:4326") + ## set correct projection
labs(x = "Longitude", y = "Latitude",
title = "Mean annual temperature") +
theme_minimal() ## set custom plot style
gThe nice thing about ggplot is the flexibility to customize your plot
further. Thus it is very suitable to create static high-quality maps for
publication which can be saved via ggsave(). Make sure that
the quality is sufficient by setting the dpi (“dots per inch”) to at
least 600.
g2 <- g + theme(plot.title = element_text(size = 18, face = "bold"),
plot.title.position = "plot",
legend.position = c(0.2, 0.95),
legend.direction = "horizontal",
legend.key.width = unit(2.2, "lines"),
legend.key.height = unit(0.7, "lines"))
g2ggsave(filename = paste0(output_wd, "/BorneoMap_4326_ggplot.png"),
width = 8, height = 7, dpi = 600, bg = "white")(ggsave() saves automatically the last ggplot. You can
also specify a ggplot object via plot =).
The saved map: check output folder
Further beautiful ideas with ggplot: https://eliocamp.github.io/metR/reference/geom_contour_tanaka.html
End of optional plotting.
3D plot with persp()
# Cool 3D plots with rgl library, e.g. 'rgl.surface'
persp(x = ras_bio_asc_24, xlab = "Easting", ylab = "Northing",
zlab = "elevation", r = 9, d = 1.5, expand = 0.1,
ticktype = "detailed")Why is the map so spiny? Check units of x, y and z coordinates! This is still a geographic CRS, not a projected one. The units are in degrees. Solution: project! (chapter: Raster projection).
Raster projection
Project the grid (the small one!! because for large rasters it may take a long time) to have all units in meters. Remember from basics in GIS: We are now changing from angular units for the x and y coordinate (long/ lat) to a CRS where the x and y coordinates are given in meter units.
Bor_dem_moll <- terra::project(ras_bio_asc_01_cr, "+proj=moll +lat_0=65 +lon_0=10")
persp(Bor_dem_moll, xlab = "Easting", ylab = "Northing",
zlab = "elevation", main = "Elevation model of Borneo",
r = 1, d = 5.5, expand = 0.1, ticktype = "detailed")You can now try all kinds of funny projections on your own, for example, use the Albers Equal Area projection with EPSG code 3085. Give it a try:
Bor_dem_AEA <- terra::project(ras_bio_asc_01_cr, 'EPSG:3085')
persp(Bor_dem_AEA, xlab = "Easting", ylab = "Northing",
zlab = "elevation", main = "Elevation model of Borneo",
r = 1, d = 5.5, expand = 0.1)Raster Stacks
A raster stack is a collection of SpatRaster objects
with the same spatial extent and resolution, similar to a geo-database.
Go into the maps folder and check what is inside:
## gives names and full path of file
files.full <- list.files(path = maps_wd, pattern = '.asc$', full.names = TRUE)
# files.full ## check also
files.full[1:3]## [1] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_01.asc"
## [2] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_21.asc"
## [3] "C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_22.asc"
## names only
files.rel <- list.files(path = maps_wd, pattern = '.asc$', full.names = FALSE)
files.rel[1:3]## [1] "bio_asc_01.asc" "bio_asc_21.asc" "bio_asc_22.asc"
Tip of the day: Always work with the full path, as you might have stored interim results of maps with the same name in other folders than your target folder. You can avoid using the wrong maps by using the full path specification to the respective data files.
The advantage is that you do not need to apply a command to each
raster map separately, but can do it ‘all in one’. E.g., We can set the
crs of each single raster in just one line. In the following, we
stack four maps in a ‘geodatabase’ called ‘predictors’. In the
following, we select 4 spatial layers, numbers 1, 12, 22 and 24. The
description of what they represent is in the data description .doc in
the map folder. We load them all together with
terra::rast()
predictors <- rast(x = files.full[c(1, 2, 4, 6)])
crs(predictors) #ask whether/ in which CRS the coordinates are in## [1] "GEOGCRS[\"WGS 84 (CRS84)\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n CS[ellipsoidal,2],\n AXIS[\"geodetic longitude (Lon)\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n AXIS[\"geodetic latitude (Lat)\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"World\"],\n BBOX[-90,-180,90,180]],\n ID[\"OGC\",\"CRS84\"]]"
## it is already assigned. If not:
#crs(predictors) <- "+proj=longlat +datum=WGS84" #or# ... <- "EPSG:4326"A raster stack contains the single raster layers in a list:
## class : SpatRaster
## dimensions : 1425, 1423, 4 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
## sources : bio_asc_01.asc
## bio_asc_21.asc
## bio_asc_24.asc
## bio_asc_42.asc
## names : bio_asc_01, bio_asc_21, bio_asc_24, bio_asc_42
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
## source : bio_asc_01.asc
## name : bio_asc_01
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)
## source : bio_asc_01.asc
## name : bio_asc_01
Plotting stacks
Plotting just one layer:
Plotting all layers:
Hint: With {tmap}, plotting all stack layers at once is
not recommended: {tmap} is standardizing the legend to the
min and max of all layers, which is problematic if maps are not in the
same units. That’s why you cannot see the information in the maps:
Manipulating rasters
Do any kind of raster algebra (summing, dividing,…) with the raster.
Important: if you are adding the values of two SpatRaster
objects, the projection and extent must be the same.
For changing cell size use aggregate() or resample():
## collapse 20*20 cells into 1 using function (fun) 'mean':
ras_bio_asc_01_agg <- aggregate(x = ras_bio_asc_01, fact = 20, fun = mean) We finally plot the two new rasters next to each other:
par(mfrow = c(1,2)) # recap from course 1: plots in 1 row and 2 columns
plot(x = new_ras, col = rev(rainbow(5)))
plot(ras_bio_asc_01_agg)Queries on raster values
You can easily access the cell information of the raster via
values()
## convert ras_bio_asc_01 to degree Celsius units (divide by 10)
range(values(ras_bio_asc_01), na.rm = TRUE)## [1] 63 278
Visualise range of the values with a simple boxplot (see course 1 R Intro).
Similarly, we can look at the distribution of the values in each single layer of a raster stack.
Example: Find areas with mean annual temp above 25 degrees Celsius
## first divide by 10 to get 'realistic' temperature ranges. Remember the values
## of the raster had integer values because this data type uses less computer memory
## than a floating-point number (meaning numbers with digits)
mean.t.c <- ras_bio_asc_01 / 10
range(values(mean.t.c), na.rm = TRUE)## [1] 6.3 27.8
## now ask which ones are above 25 degrees Celsius
mean.t.c.25 <- mean.t.c >= 25
mean.t.c.25 ## check the values -> binary! ## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
## source(s) : memory
## varname : bio_asc_01
## name : bio_asc_01
## min value : FALSE
## max value : TRUE
Have a look at the result using a binary plot, i.e.all areas where the temperature is above 25 degrees (TRUE) are plotted:
Retrieve metrics/ statistics from the SpatRaster objects
with function global():
## mean
## bio_asc_01 255.61
## bio_asc_21 0.02
## bio_asc_24 251.75
## bio_asc_42 10.65
Optional - pretty violin plots in ggplot2
With the help of the {rasVis} package we can plot violin
charts bwplot() to visualize the summary statistics of a
raster across all raster cells:
However, these plots are not really pretty. More beautiful violin
plots can be found in ggplot2. For this, we need to transform the
SpatRaster data into a data.frame first:
## first omit NA values (which represent the ocean around Borneo)
raster_df <- na.omit(data.frame(values(predictors[[c(1,3)]])))
raster_names <- names(raster_df)
raster_ct <- dim(raster_df)[1]
df2 <- data.frame(val = c(raster_df[,names(raster_df)[1]],
raster_df[,names(raster_df)[2]]))
df2$grp <- rep(raster_names , each = raster_ct)
head(df2)## val grp
## 1 271 bio_asc_01
## 2 271 bio_asc_01
## 3 270 bio_asc_01
## 4 270 bio_asc_01
## 5 270 bio_asc_01
## 6 270 bio_asc_01
## take a random subsample of the data to not crash your PC when plotting:
a <- sample(x = nrow(df2), size = 1000, replace = FALSE)
df3 <- df2[a,]## a violin-box plot combination with raw data strips
p <- ggplot(data = df3, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", color = "#3366FF", bw = 20) +
geom_boxplot(width = 0.15, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, color = 'blue')
pSince the units are so different, it makes more sense in this case to
plot them separately, using a facet_wrap() and
scales = free.
p +
facet_wrap(vars(grp), scales = "free") +
scale_x_discrete(guide = "none") + ## remove axis ticks and labels on x
labs(x = NULL, y = "Value") +
theme_minimal(base_size = 15) ## set custom plot style# save the last plot
ggsave(paste0(output_wd, "/savedggplot.pdf"), width = 5, height = 5, dpi = 600)
## or use here:
# ggsave(here("output", "savedggplot.pdf"), width = 5, height = 5, dpi = 600)…or plot them separately and combine them using the {patchwork}
package:
df3_bio1 <- subset(df3, df3$grp == 'bio_asc_01')
df3_bio24 <- subset(df3, df3$grp == 'bio_asc_24')
p1 <- ggplot(data = df3_bio1, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", colour = "#3366FF") +
geom_boxplot(width = 0.2, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, colour = "#3366FF") +
labs(x = NULL, y = "Value")
p2 <- ggplot(data = df3_bio24, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", colour = "#3366FF") +
geom_boxplot(width = 0.2, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, colour = "#3366FF") +
labs(x = NULL, y = "")
p1check here for how {ggplot2} works if you want to make
really nice plots: * https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf
or get inspiration here:
- https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ *https://www.r-graph-gallery.com/
Back to spatial R!
Accessing and changing single raster cell values
Remember the function extract(). In the
following, we extract from the small raster the first three values (=
cols 1-3; longitude or x-axis) at line 5 (= row 5; latitude or
y-axis).
cells <- cellFromRowCol(object = ras_bio_asc_01_cr, row = 5, col = 1:3)
cells ## Note: returns cell ID number = the index / row number! Not the value!!!!## [1] 121 122 123
## bio_asc_01
## 1 266
## 2 265
## 3 266
## plot it:
plot(x = ras_bio_asc_01_cr)
## to plot the points on top, insert the 'cells' index into
## the data frame of the coordinates (crds) of the SpatRaster object ras_bio_asc_01_cr.
## Do you fully understand the following line?:
points(x = crds(ras_bio_asc_01_cr)[cells,], col = "blue")Change raster values in an existing raster:
## get the coordinates; cells was the object containing the index (i.e. the cell numbering)
## the function returns a matrix (it could have been easy...)
xy <- xyFromCell(object = ras_bio_asc_01_cr, cell = cells)
xy## x y
## [1,] 117.2050 5.463488
## [2,] 117.2133 5.463488
## [3,] 117.2216 5.463488
## [1] "matrix" "array"
## bio_asc_01
## 1 266
## 2 265
## 3 266
## Change values of a raster, e.g. for adding forest or creating a corridor
## take care! -> irreversible! better work on a copy!
copy_ras <- ras_bio_asc_01_cr
copy_ras[cells] <- 250 # here we set all values in the raster at the position index cells to 250
plot(x = copy_ras, col = viridis(20))Optional - extract values from a stack
Advantage of working with stacks: Extract raster values from all
SpatRaster objects at once (raster stack). This is very
important if you want to create your master table, i.e. based on the
xy-coordinates of your species sightings (i.e. GPS point locations), you
could extract the underlying values of all environmental predictors at
that location in one go.
We do that for an area in the center of Borneo. For this, first get
the row and column indices of the center of a large Borneo map,
e.g. ras_bio_asc_01, and 5 x 5 cells in addition, i.e. a
square with an area of 5 km x 5 km:
## divide by 2 to get the center cell
center_x = floor(nrow(predictors) / 2) ## learn about the functions round(),
center_y = floor(ncol(predictors) / 2) ## ceiling(), floor()
center_x; center_y ## center coordinates of the Borneo maps## [1] 712
## [1] 711
## start from center cell and go 5 cells in x and y direction
## This should yeald 25 cells
stack_cells <- terra::cellFromRowColCombine(
object = ras_bio_asc_01,
row = c(center_x:(center_x + 4)),
col = c(center_y:(center_y + 4))
)
## check
#?cellFromRowColCombine
## to get an overview over all possible functions to get acces to the raster
head(stack_cells) ## mind: these are index numbers! Not cell values!## [1] 1012464 1012465 1012466 1012467 1012468 1013887
Why are these index numbers so large? Because the counting starts at
the upper left corner of the map and continues consecutively, i.e. the
data (ras_bio_asc_01@data@values) are one huge vector with
length nrow * ncol!
Now, extract them with the useful function èxtract()
into an object called ‘mastertable’ that you can for example use for
statistical analyses in R:
## bio_asc_01 bio_asc_21 bio_asc_24 bio_asc_42
## 1 255 0.000000000 303 1
## 2 252 0.000000000 325 1
## 3 253 0.000000000 284 1
## 4 254 0.000000000 225 1
## 5 254 0.000000000 351 1
## 6 253 0.008333334 299 1
## stack_cells bio_asc_01 bio_asc_21 bio_asc_24 bio_asc_42
## 1 1012464 255 0.000000000 303 1
## 2 1012465 252 0.000000000 325 1
## 3 1012466 253 0.000000000 284 1
## 4 1012467 254 0.000000000 225 1
## 5 1012468 254 0.000000000 351 1
## 6 1013887 253 0.008333334 299 1
End optional.
Compute distance to points
From the three selected cells in the small raster ‘ras_bio_asc_01_cr’, we want to calculate the distance to each other cell in the raster.
## remember from above: the object 'xy' was a matrix ## check #class(xy)
## turn first into SpatVector - this is the format needed for the distance() function below
## we do this just for 1 cell (point),
sv_xy <- vect(xy)[1]
sv_xy## class : SpatVector
## geometry : points
## dimensions : 1, 0 (geometries, attributes)
## extent : 117.205, 117.205, 5.463488, 5.463488 (xmin, xmax, ymin, ymax)
## coord. ref. :
Check the object sv_xy: since xy was a
simple matrix and not a spatial object, there is no crs assigned. The
slot for ‘coord.ref.’ is empty. That means, we now have to first assign
the CRS so that the following functions know where exactly the different
layers are on earth.
## class : SpatVector
## geometry : points
## dimensions : 1, 0 (geometries, attributes)
## extent : 117.205, 117.205, 5.463488, 5.463488 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
Now the CRS is set/ assigned and we can continue with distance calculations:
##
|---------|---------|---------|---------|
=========================================
Calculate distance from many points:
cells1 <- c(cells, 250, 360) ## add more randomly chosen points to cells = cells1
sv_xy_2 <- vect(xyFromCell(ras_bio_asc_01_cr, cells1)) ## we create a new spatVect
crs(sv_xy_2) <- "epsg:4326" ## same procedure as above
my_dist <- distance(x = ras_bio_asc_01_cr, y = sv_xy_2)##
|---------|---------|---------|---------|
=========================================
Data export - saving raster data
Save the raster (not the plot…): There are many exchange formats for rasters. The best choice is considered to be GeoTiff, which also saves the projection and is smaller than ascii. However, for modelling e.g. in MaxEnt, the raster maps are needed in ascii-format. Also, ascii type files can be opened easily in any text editor. Try this with the small cropped raster.
## save the small cropped file
terra::writeRaster(x = ras_bio_asc_01_cr,
filename = paste0(output_wd,"/bor_crop.asc"),
# or use here(): here("output", "bor_crop.asc")
overwrite = TRUE,
NAflag = -9999)
terra::writeRaster(x = ras_bio_asc_01_agg,
filename = paste0(output_wd,"/ras_bio_asc_01_agg.asc"),
# or use here(): here("output", "ras_bio_asc_01_agg.asc")
overwrite = TRUE,
NAflag = -9999)Vector data / shapefiles
Import shapefiles
Currently, there are two packages sp and sf (standing for simple features), both with still important functionality. However, sf is much easier to use and handle. The suggestion currently is: learn both!
In the following, the most important commands are provided for the sf-package, and if necessary, also for the sp-package. We start with loading an ESRI shapefile.
## Border of countries and provinces of Borneo
## vecs_wd is the folder where vector data are stored
## - only loading columns 1:3, 5, 7, 17, 18 of its attribute table (a. t.):
Borneo_shp <- st_read(dsn = vecs_wd, layer = "borneo_admin")[, c(1:3, 5, 7, 17, 18)] ## Reading layer `borneo_admin' from data source
## `C:\Users\kramer\_github_lokal_NB0144\d6_teaching_collection\data\data_borneo\geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 158 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5986 ymin: -4.943054 xmax: 119.2697 ymax: 7.380556
## Geodetic CRS: WGS 84
## Protected Areas (National Parks, Nature Reserves, Forest Reserves)
PA_shp <- st_read(dsn = vecs_wd,
layer = "Bor_PA")[, c(1:4)]## Reading layer `Bor_PA' from data source
## `C:\Users\kramer\_github_lokal_NB0144\d6_teaching_collection\data\data_borneo\geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 255 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5969 ymin: -4.18356 xmax: 119.6176 ymax: 9.911229
## Geodetic CRS: WGS 84
## fix problematic polygons
PA_shp <- st_make_valid(PA_shp)
## main rivers
River_shp <- st_read(dsn = vecs_wd, layer = "sn_100000")## Reading layer `sn_100000' from data source
## `C:\Users\kramer\_github_lokal_NB0144\d6_teaching_collection\data\data_borneo\geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 396 features and 4 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 108.9454 ymin: -3.787917 xmax: 118.7879 ymax: 6.464583
## Geodetic CRS: WGS 84
## Reading layer `borneo_admin' from data source
## `C:\Users\kramer\_github_lokal_NB0144\d6_teaching_collection\data\data_borneo\geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 158 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5986 ymin: -4.943054 xmax: 119.2697 ymax: 7.380556
## Geodetic CRS: WGS 84
More here: - https://cran.r-project.org/web/packages/sf/vignettes/sf1.html
Transformations between simple feature {sf} and
SpatialPolygonsDataFrame {sp}
Since some functions are still only available in the
{sp} package, it can be helpful to remember how to back and
forth-transform between the data types.
Import spatial points
In the following, we load two point datasets of different species locations.
## Reading layer `FCsim' from data source
## `C:\Users\kramer\_github_lokal_NB0144\d6_teaching_collection\data\data_borneo\animal_data\FCsim.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## CRS: NA
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## CRS: NA
## First 10 features:
## Id POINT_X POINT_Y geometry
## 1 0 118.4117 5.221667 POINT (118.4117 5.221667)
## 2 0 118.2645 5.005603 POINT (118.2645 5.005603)
## 3 0 117.7992 5.029167 POINT (117.7992 5.029167)
## 4 0 117.6058 5.177957 POINT (117.6058 5.177957)
## 5 0 117.5566 5.381088 POINT (117.5566 5.381088)
## 6 0 117.4704 5.005603 POINT (117.4704 5.005603)
## 7 0 117.2734 5.227201 POINT (117.2734 5.227201)
## 8 0 116.7256 5.319533 POINT (116.7256 5.319533)
## 9 0 116.6467 5.162500 POINT (116.6467 5.1625)
## 10 0 117.2475 4.888333 POINT (117.2475 4.888333)
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## Geodetic CRS: WGS 84
## First 10 features:
## Id POINT_X POINT_Y geometry
## 1 0 118.4117 5.221667 POINT (118.4117 5.221667)
## 2 0 118.2645 5.005603 POINT (118.2645 5.005603)
## 3 0 117.7992 5.029167 POINT (117.7992 5.029167)
## 4 0 117.6058 5.177957 POINT (117.6058 5.177957)
## 5 0 117.5566 5.381088 POINT (117.5566 5.381088)
## 6 0 117.4704 5.005603 POINT (117.4704 5.005603)
## 7 0 117.2734 5.227201 POINT (117.2734 5.227201)
## 8 0 116.7256 5.319533 POINT (116.7256 5.319533)
## 9 0 116.6467 5.162500 POINT (116.6467 5.1625)
## 10 0 117.2475 4.888333 POINT (117.2475 4.888333)
Spatial points are defined by two columns with x and y coordinates;
they can be imported as normal data.frames and then
converted to spatial objects:
pt_file <- paste0(anim_wd, "/MyNewSpecies.csv")
df_recs <- read.table(file = pt_file, header = TRUE, sep = ',') # animal records
class(x = df_recs)## [1] "data.frame"
## species long lat
## 1 mysimsp 109.4716 -1.11984615
## 2 mysimsp 109.3216 -0.02817942
## 3 mysimsp 117.0383 3.60515411
## 4 mysimsp 109.1550 0.09682059
## 5 mysimsp 111.1550 1.78848735
## 6 mysimsp 115.4550 4.93848752
This is still a data.frame, not yet a spatial object.
See below of how to convert a data.frame into a spatial
{sf} object. You need to give the columns of your file that
contain the x and y coordinates, and you need to set the CRS.
Working with vector data
Explore the objects and retrievable information:
## Classes 'sf' and 'data.frame': 255 obs. of 5 variables:
## $ SITE_ID : int 785 787 791 795 1300 783 784 786 788 789 ...
## $ NAME_ENG: chr "Kinabalu" "Mulu" "Niah" "Tawau Hill Park" ...
## $ COUNTRY : chr "Malaysia" "Malaysia" "Malaysia" "Malaysia" ...
## $ SUB_NAT : chr "Sabah" "Sarawak" "Sarawak" "Sabah" ...
## $ geometry:sfc_MULTIPOLYGON of length 255; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:322, 1:2] 117 117 117 117 117 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA
## ..- attr(*, "names")= chr [1:4] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT"
## $names
## [1] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT" "geometry"
##
## $row.names
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
## [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
## [73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
## [91] 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
## [109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
## [127] 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
## [145] 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
## [163] 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
## [181] 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
## [199] 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216
## [217] 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234
## [235] 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252
## [253] 253 254 255
##
## $sf_column
## [1] "geometry"
##
## $agr
## SITE_ID NAME_ENG COUNTRY SUB_NAT
## <NA> <NA> <NA> <NA>
## Levels: constant aggregate identity
##
## $class
## [1] "sf" "data.frame"
Accessing simple feature objects — easy handling, similar to
data.frames:
## SpatExtent : 108.5968696, 119.6176097, -4.1835603, 9.9112287 (xmin, xmax, ymin, ymax)
## [1] 108.5969
## [1] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT" "geometry"
Summarizing spatial information for each column of attribute table [a.t.]:
## SITE_ID NAME_ENG COUNTRY SUB_NAT
## Min. : 783 Length:255 Length:255 Length:255
## 1st Qu.: 3844 Class :character Class :character Class :character
## Median : 9842 Mode :character Mode :character Mode :character
## Mean : 45309
## 3rd Qu.: 19582
## Max. :901238
## geometry
## MULTIPOLYGON :255
## epsg:4326 : 0
## +proj=long...: 0
##
##
##
Have a look at the content of the attribute table. It looks like a
data.frame, but has the geometry stored as last column
(recap from the introductory lecture).
## Simple feature collection with 6 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 111.8754 ymin: 1.32105 xmax: 118.7899 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 1 785 Kinabalu Malaysia Sabah MULTIPOLYGON (((116.5979 6....
## 2 787 Mulu Malaysia Sarawak MULTIPOLYGON (((114.9033 4....
## 3 791 Niah Malaysia Sarawak MULTIPOLYGON (((113.7986 3....
## 4 795 Tawau Hill Park Malaysia Sabah MULTIPOLYGON (((117.8497 4....
## 5 1300 Lanjak-Entimau Malaysia Sarawak MULTIPOLYGON (((112.2054 1....
## 6 783 Semporna Malaysia Sabah MULTIPOLYGON (((118.7796 4....
Retrieving information of a.t. — recap working with
data.frames from Day1_R-Intro course.
## Simple feature collection with 1 feature and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 116.4638 ymin: 5.99952 xmax: 116.7631 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 1 785 Kinabalu Malaysia Sabah MULTIPOLYGON (((116.5979 6....
## Simple feature collection with 255 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5969 ymin: -4.18356 xmax: 119.6176 ymax: 9.911229
## Geodetic CRS: WGS 84
## First 10 features:
## NAME_ENG geometry
## 1 Kinabalu MULTIPOLYGON (((116.5979 6....
## 2 Mulu MULTIPOLYGON (((114.9033 4....
## 3 Niah MULTIPOLYGON (((113.7986 3....
## 4 Tawau Hill Park MULTIPOLYGON (((117.8497 4....
## 5 Lanjak-Entimau MULTIPOLYGON (((112.2054 1....
## 6 Semporna MULTIPOLYGON (((118.7796 4....
## 7 Samunsam MULTIPOLYGON (((109.6615 1....
## 8 Similajau MULTIPOLYGON (((113.1826 3....
## 9 Klias MULTIPOLYGON (((115.636 5.3...
## 10 Pulau Tiga MULTIPOLYGON (((115.6796 5....
## [1] "Kinabalu" "Mulu" "Niah" "Tawau Hill Park"
## [5] "Lanjak-Entimau" "Semporna"
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 116.4638 ymin: 5.99952 xmax: 116.7631 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID geometry
## 1 785 MULTIPOLYGON (((116.5979 6....
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 114.7777 ymin: 3.9186 xmax: 114.9945 ymax: 4.28586
## Geodetic CRS: WGS 84
## COUNTRY geometry
## 2 Malaysia MULTIPOLYGON (((114.9033 4....
The following returns the row indices of the data frame or a.t., respectively:
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 26 27 28 29 30
## [19] 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
## [37] 49 50 51 52 53 54 55 56 63 64 65 66 67 68 69 70 71 72
## [55] 73 74 75 76 77 80 81 101 102 103 104 113 114 115 116 117 118 119
## [73] 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137
## [91] 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155
## [109] 156 157 158 159 161 162 163 164 165 166 167 169 170 171 172 185 186 187
## [127] 190 191 192 193 194 195 196 221 222 223 224 225 226 227 228 230 233 237
## [145] 238 239 240 241 249 250 251 254 255
Visualising vector data
plot()
It is ok to use a simple plot() function for
{sf} objects. However, mind that this object has 7 columns,
so each column holds information that will be plotted, i.e. you will
plot 7 single plots. So do a little transformation: either plot just one
column, or define it as {sp} object:
Plotting {sp} objects:
## Defining it as(..., "Spatial"), i.e. transforming it to a `{sp}` object (see above)
plot(as(Borneo_shp, "Spatial")) ## plot geometry without information of a colPoints in data.frames can be plotted, as long as the
coordinates are the same, can also be plotted in space. We first plot
the polygon, then add the points (note, one is a simple
data.frame, the other is a shapefile):
plot(x = as(Borneo_shp, "Spatial"), col = 'grey', border = 'white') ## polygon
points(x = df_recs$long, df_recs$lat, cex = 0.5, pch = 15) ## simple d.f.!
plot(x = as(pt_shp, "Spatial"), col = 'blue', add = TRUE) {ggplot2} package
{ggplot2} is currently the best way to plot
{sf} objects: https://www.r-spatial.org/r/2018/10/25/ggplot2-sf.html.
The parameter geom_sf() is set here.
ggplot(data = PA_shp) +
geom_sf(fill = "chartreuse3", color = NA) +
labs(x = "Longitude", y = "Latitude",
title = "Protected areas") +
theme_minimal(base_size = 15) ## set custom plot styleNote that it does NOT work for {sp} objects
(SpatialPolygonsDataFrame).
{tmap} package
Static map
The site ID is colour-coded here in a gradient
tmap_mode(mode = "plot")
tm_shape(shp = PA_shp[, 1]) +
tm_polygons(col = "SITE_ID", palette = grDevices::terrain.colors(5))You can also plot several layers or highlighted details. For example, we can plot a single selected protected area. Administrative area borders are black, protected areas green, and the single selected one is red.
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp) +
tm_polygons(border.col = "black") +
tm_shape(PA_shp[,1]) +
tm_polygons(border.col = "deepskyblue4") +
tm_shape(PA_shp[1, ]) +
tm_polygons(border.col = "red")Add spatial points (note: it does not work with
data.frames) to the plot.
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp) +
tm_polygons(col = "grey", border.col = "white") +
tm_shape(shp = recs_sf) +
tm_dots(shape = 3, size = 0.25) +
tm_shape(shp = pt_shp) +
tm_dots(col = "blue") Interactive map
Saving vector data
Work with subsets, select Malaysia, save and plot as shapefile with st_write():
## select the protected areas of Malaysia only:
Mal_PA_shp <- subset(PA_shp, PA_shp$COUNTRY == 'Malaysia')
st_write(obj = Mal_PA_shp,
dsn = output_wd,
layer = 'ProtectedAreasMalaysia',
driver = 'ESRI Shapefile',
delete_layer = TRUE)## Deleting layer `ProtectedAreasMalaysia' using driver `ESRI Shapefile'
## Writing layer `ProtectedAreasMalaysia' to data source
## `C:/Users/kramer/_github_lokal_NB0144/d6_teaching_collection/output' using driver `ESRI Shapefile'
## Writing 153 features with 4 fields and geometry type Multi Polygon.
Geospatial calculation
This refers to the GIS functions of area calculations, distance
calculations, spatial overlays and intersections. The new
{sf} package has the same functionality, with functions
starting with st_, e.g. st_buffer(),
st_intersection() etc.
- https://geocompr.robinlovelace.net/spatial-operations.html
- https://r.geocompx.org/geometry-operations#clipping
Area calculation
Returns the area in square meters for all features:
## Units: [m^2]
## [1] 7651059265 1238543644 486042067 6128271711 1468016456 2955188183
Hint: you can also calculate the area for single features (e.g. one polygon).
## 486042067 [m^2]
Example: Calculate area of Malaysia in km²
Mal_Borneo_shp <- subset(Borneo_shp, Borneo_shp$NAME_0 == 'Malaysia')
head(st_area(x = Mal_Borneo_shp) / 1000000) ## or / 1e6## Units: [m^2]
## [1] 7651.0593 1238.5436 486.0421 6128.2717 1468.0165 2955.1882
## better: use set_units to change the units from m^2 to km^2
Mal_Borneo_shp$area <- units::set_units(x = st_area(x = Mal_Borneo_shp), value = km^2)
head(Mal_Borneo_shp$area)## Units: [km^2]
## [1] 7651.0593 1238.5436 486.0421 6128.2717 1468.0165 2955.1882
CRS projection
Recap: If no CRS is assigned for a file, the CRS can be set with
st_crs(). However, if a file has already a CRS, e.g. WGS84
(angular units, longitude and latitude), and you want to change into a
projection with planar units (e.g. meters), use
st_transform().
Borneo_shp_moll <- st_transform(Borneo_shp, c("+proj=moll +datum=WGS84"))
class(Borneo_shp_moll) # sf object, data.frame!## [1] "sf" "data.frame"
Plot it and have a look at the changes:
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp_moll) +
tm_polygons(border.col = "blue") ## do you see the difference?We can also use pretty {ggplot2} with {sf}
objects:
ggplot(Borneo_shp_moll) +
geom_sf(color = "blue") +
theme_minimal(base_size = 15) ## set custom plot styleNote the change in the area calculation!
## Units: [m^2]
## [1] 7666715202 1241315346 487129643 6141990607 1471302821 2961802507
## Units: [m^2]
## [1] 7651059265 1238543644 486042067 6128271711 1468016456 2955188183
Example: Add the area of a polygon as new column to a.t.
## create the column 'area_km2'
area_km2 <- set_units(x = st_area(x = Borneo_shp_moll, byid = TRUE), value = km^2)
Borneo_shp_moll = data.frame(Borneo_shp_moll, area_km2)
head(x = Borneo_shp_moll)## ID_0 ISO NAME_0 NAME_1 NAME_2 Shape_Leng Shape_Area
## 1 158 MYS Malaysia Sabah Lahad Datu 6.7475367 0.62106012
## 2 158 MYS Malaysia Sabah Papar 1.6727672 0.10065591
## 3 158 MYS Malaysia Sabah Penampang 0.9548857 0.03951545
## 4 158 MYS Malaysia Sabah Pensiangan 4.0257859 0.49725547
## 5 158 MYS Malaysia Sabah Pitas 2.9223456 0.11955352
## 6 158 MYS Malaysia Sabah Ranau 2.4815704 0.24027591
## geometry area_km2
## 1 MULTIPOLYGON (((11824461 58... 7666.7152 [km^2]
## 2 MULTIPOLYGON (((11590069 72... 1241.3153 [km^2]
## 3 MULTIPOLYGON (((11592912 72... 487.1296 [km^2]
## 4 MULTIPOLYGON (((11651495 62... 6141.9906 [km^2]
## 5 MULTIPOLYGON (((11703665 83... 1471.3028 [km^2]
## 6 MULTIPOLYGON (((11675075 77... 2961.8025 [km^2]
Spatial overlay and intersections
Points and polygons with st_intersection()
Now we want to know which of our species observations (recs_sf) were
taken in protected areas (PA_shp). For this, we use
st_intersection():
## retrieve the geometry (location) indices of PA_shp at
## the locations of sp_recs: which points are in PA_shp
nrow(recs_sf) ## 500## [1] 500
## [1] 11
Point and raster data with extract()
Then we want to know in which climatic zone we have observed the species ‘recs_sf’. For this, we extract the information from the raster of mean annual temperature (ras_bio_asc_01) to the point locations of species observations:
# for a RASTER: extract mean ann. temp. from ras_bio_asc_01
# and add it to a.t.of the locations/ points
mean_t <- extract(x = ras_bio_asc_01, y = vect(recs_sf)) ## for {terra} we need to wrap the sf object into spatial vector with`vect()`
recs_sf$mean_t <- mean_t$bio_asc_01
mean(x = recs_sf$mean_t) # hist(sp_recs_sf$mean_t)## [1] 272.7788
Polygon and raster data with extract()
Finally, we want to know in which elevation range our protected areas are located. Extract elevation per PA and save average to attribute table of the shapefile/ vector data as a new column. The elevation raster was uploaded as ras_bio_asc_24 (see above, raster data section).
fewPA <- Mal_PA_shp[c(1:5), 1] ## select only 5 PAs to speed up computation
## returns a data.frame — each ID contains multiple elevation raster cells (ras_bio_asc_24).
## the shapefile has to be turned into a spatial vector
tmp <- extract(x = ras_bio_asc_24, y = vect(fewPA), xy = TRUE)
str(tmp) ## overview over the 'data.frame' object. It should have 5 IDs for example. Let's check:## 'data.frame': 3681 obs. of 4 variables:
## $ ID : num 1 1 1 1 1 1 1 1 1 1 ...
## $ bio_asc_24: int 286 491 508 500 775 986 645 582 335 675 ...
## $ x : num 117 117 117 117 117 ...
## $ y : num 6.49 6.49 6.48 6.48 6.48 ...
## [1] 1 2 3 4 5
## As a protector area feature of the shapefile comprises several elevation raster cells,
## they are all appended in the extracted data.frame
head(tmp)## ID bio_asc_24 x y
## 1 1 286 116.5800 6.488488
## 2 1 491 116.5883 6.488488
## 3 1 508 116.5800 6.480154
## 4 1 500 116.5883 6.480154
## 5 1 775 116.5966 6.480154
## 6 1 986 116.6050 6.480154
## use tapply (see course 1 - R Intro) to check the values.
## The protected area with ID = 1 seems to comprise high elevation
PA_elev <- tapply(X=tmp$bio_asc_24, INDEX = tmp$ID, FUN = mean)
PA_elev## 1 2 3 4 5
## 1348.4446 554.8083 107.7250 512.3037 403.9674
Append mean elevation to the shapefile:
## ...or use function 'aggregate()'. this returns a `data.frame`.
## The elevation info is in column 'x'
mean_tmp <- terra::aggregate(tmp$bio_asc_24, by = list(Category = tmp$ID), FUN = mean)
## Since both objects, the shapefile 'fewPA' and the data.frame 'mean_tmp' (or 'PA_elev')
## are ordered by ID,
## you can directly append the information to the attribute table of 'fewPA'
fewPA$mean_elevation <- mean_tmp$x
#fewPA$mean_elevation <- PA_elev ## alternatively
fewPA## Simple feature collection with 5 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 111.8754 ymin: 1.32105 xmax: 118.0747 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID geometry mean_elevation
## 1 785 MULTIPOLYGON (((116.5979 6.... 1348.4446
## 2 787 MULTIPOLYGON (((114.9033 4.... 554.8083
## 3 791 MULTIPOLYGON (((113.7986 3.... 107.7250
## 4 795 MULTIPOLYGON (((117.8497 4.... 512.3037
## 5 1300 MULTIPOLYGON (((112.2054 1.... 403.9674
Spatial data sources in the web
{rnaturalearth}
NaturalEarth (http://www.naturalearthdata.com) is a public domain map data set that features vector and raster data of physical and cultural properties. It is available at 1:10m, 1:50m, and 1:110 million scales.
{rnaturalearth}
(https://docs.ropensci.org/rnaturalearth) is an R package
to hold and facilitate interaction with NaturalEarth (ne_…) map
data.
After loading the package, you can for example quickly access
shapefiles of all countries that are already projected and can be stored
as either sp or sf objects:
library(rnaturalearth)
## store as sf object (simple features)
world <- ne_countries(returnclass = "sf")
class(world)## [1] "sf" "data.frame"
## $input
## [1] "WGS 84"
This country data set (which is actually not downloaded but stored locally by installing the package) already contains several useful variables, mostly referring to different naming conventions (helpful when joining with other data sets), to identify continents and regions, and also some information on population size, GDP, and economy:
## [1] "featurecla" "scalerank" "labelrank" "sovereignt" "sov_a3"
## [6] "adm0_dif" "level" "type" "tlc" "admin"
## [11] "adm0_a3" "geou_dif" "geounit" "gu_a3" "su_dif"
## [16] "subunit" "su_a3" "brk_diff" "name" "name_long"
## [21] "brk_a3" "brk_name" "brk_group" "abbrev" "postal"
## [26] "formal_en" "formal_fr" "name_ciawf" "note_adm0" "note_brk"
## [31] "name_sort" "name_alt" "mapcolor7" "mapcolor8" "mapcolor9"
## [36] "mapcolor13" "pop_est" "pop_rank" "pop_year" "gdp_md"
## [41] "gdp_year" "economy" "income_grp" "fips_10" "iso_a2"
## [46] "iso_a2_eh" "iso_a3" "iso_a3_eh" "iso_n3" "iso_n3_eh"
## [51] "un_a3" "wb_a2" "wb_a3" "woe_id" "woe_id_eh"
## [56] "woe_note" "adm0_iso" "adm0_diff" "adm0_tlc" "adm0_a3_us"
## [61] "adm0_a3_fr" "adm0_a3_ru" "adm0_a3_es" "adm0_a3_cn" "adm0_a3_tw"
## [66] "adm0_a3_in" "adm0_a3_np" "adm0_a3_pk" "adm0_a3_de" "adm0_a3_gb"
## [71] "adm0_a3_br" "adm0_a3_il" "adm0_a3_ps" "adm0_a3_sa" "adm0_a3_eg"
## [76] "adm0_a3_ma" "adm0_a3_pt" "adm0_a3_ar" "adm0_a3_jp" "adm0_a3_ko"
## [81] "adm0_a3_vn" "adm0_a3_tr" "adm0_a3_id" "adm0_a3_pl" "adm0_a3_gr"
## [86] "adm0_a3_it" "adm0_a3_nl" "adm0_a3_se" "adm0_a3_bd" "adm0_a3_ua"
## [91] "adm0_a3_un" "adm0_a3_wb" "continent" "region_un" "subregion"
## [96] "region_wb" "name_len" "long_len" "abbrev_len" "tiny"
## [101] "homepart" "min_zoom" "min_label" "max_label" "label_x"
## [106] "label_y" "ne_id" "wikidataid" "name_ar" "name_bn"
## [111] "name_de" "name_en" "name_es" "name_fa" "name_fr"
## [116] "name_el" "name_he" "name_hi" "name_hu" "name_id"
## [121] "name_it" "name_ja" "name_ko" "name_nl" "name_pl"
## [126] "name_pt" "name_ru" "name_sv" "name_tr" "name_uk"
## [131] "name_ur" "name_vi" "name_zh" "name_zht" "fclass_iso"
## [136] "tlc_diff" "fclass_tlc" "fclass_us" "fclass_fr" "fclass_ru"
## [141] "fclass_es" "fclass_cn" "fclass_tw" "fclass_in" "fclass_np"
## [146] "fclass_pk" "fclass_de" "fclass_gb" "fclass_br" "fclass_il"
## [151] "fclass_ps" "fclass_sa" "fclass_eg" "fclass_ma" "fclass_pt"
## [156] "fclass_ar" "fclass_jp" "fclass_ko" "fclass_vn" "fclass_tr"
## [161] "fclass_id" "fclass_pl" "fclass_gr" "fclass_it" "fclass_nl"
## [166] "fclass_se" "fclass_bd" "fclass_ua" "geometry"
We can quickly plot it:
NOTE: Unfortunately, NaturalEarth is using weird de-facto and on-the-ground rules to define country borders which do not follow the borders the UN and most countries agree on. For correct and official borders, please use the {rgeoboundaries} package (see below).
You can specify the scale, category and type you want as in the examples below.
glacier_small <- ne_download(type = "glaciated_areas", category = "physical",
scale = "small", returnclass = "sf")## Reading layer `ne_110m_glaciated_areas' from data source
## `C:\Users\kramer\AppData\Local\Temp\Rtmpwviz70\ne_110m_glaciated_areas.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 11 features and 3 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -180 ymin: -89.9989 xmax: 179.9999 ymax: 83.36507
## Geodetic CRS: WGS 84
glacier_large <- ne_download(type = "glaciated_areas", category = "physical",
scale = "large", returnclass = "sf")## Reading layer `ne_10m_glaciated_areas' from data source
## `C:\Users\kramer\AppData\Local\Temp\Rtmpwviz70\ne_10m_glaciated_areas.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1886 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -180 ymin: -89.99993 xmax: 180 ymax: 83.55756
## Geodetic CRS: WGS 84
ggplot() +
geom_sf(data = world, color = "grey80", fill ="grey80") +
geom_sf(data = glacier_small, color = "grey40", fill = "grey40") +
coord_sf(crs = "+proj=eqearth") +
theme_void()
ggplot() +
geom_sf(data = world, color = "grey80", fill ="grey80") +
geom_sf(data = glacier_large, color = "grey40", fill = "grey40") +
coord_sf(crs = "+proj=eqearth") +
theme_void(){rgeoboundaries}
The {rgeoboundaries}
package uses the Global Database of Political Administrative
Boundaries that provide generally accepted political borders. The data
are licensed openly.
library(rgeoboundaries)
## you might be asked to create a cache folder -> click yes
## the following takes a while to load. Be patient.
gb_adm0() ## political boundary file## Simple feature collection with 218 features and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -180 ymin: -90 xmax: 180 ymax: 83.63339
## Geodetic CRS: WGS 84
## First 10 features:
## shapeGroup shapeType shapeName geometry
## 1 AFG ADM0 Afghanistan MULTIPOLYGON (((74.88986 37...
## 2 GBR ADM0 United Kingdom MULTIPOLYGON (((33.01302 34...
## 3 ALB ADM0 Albania MULTIPOLYGON (((20.07889 42...
## 4 DZA ADM0 Algeria MULTIPOLYGON (((8.641941 36...
## 5 USA ADM0 United States MULTIPOLYGON (((-168.1579 -...
## 6 ATA ADM0 Antarctica MULTIPOLYGON (((-60.06171 -...
## 7 ATG ADM0 Antigua & Barbuda MULTIPOLYGON (((-62.34839 1...
## 8 ARG ADM0 Argentina MULTIPOLYGON (((-63.83417 -...
## 9 AND ADM0 Andorra MULTIPOLYGON (((1.725802 42...
## 10 AGO ADM0 Angola MULTIPOLYGON (((11.7163 -16...
## plotting takes a long time!!!
ggplot(gb_adm0()) +
geom_sf(color = "grey40", lwd = .2) +
coord_sf(crs = "+proj=eqearth") +
theme_void()Lower administrative levels are available as well, e.g. in Germany adm1 represents federal states (“Bundesländer”), adm2 districts (“Kreise”) and so on.
Let’s plot the admin 1 levels for the DACH countries:
dach <- gb_adm1(c("germany", "switzerland", "austria"), type = "sscgs")
ggplot(dach) +
geom_sf(aes(fill = shapeGroup)) +
scale_fill_brewer(palette = "Set2") +
theme_void(){osmdata}
OpenStreetMap (https://www.openstreetmap.org) is a collaborative
project to create a free editable geographic database of the world. The
geodata underlying the maps is considered the primary output of the
project and is accessible from R via the {osmdata}
package.
We first need to define our query and limit it to a region. You can explore the features and tags (also available as information via OpenStreetMap directly).
## [1] "4wd_only" "abandoned" "abutters" "access" "addr" "addr:city"
## # A tibble: 6 × 2
## Key Value
## <chr> <chr>
## 1 craft agricultural_engines
## 2 craft atelier
## 3 craft bakery
## 4 craft basket_maker
## 5 craft beekeeper
## 6 craft blacksmith
## building the query, e.g. beekeepers
beekeeper_query <-
## you can automatically retrieve a bounding box (pr specify one manually)
getbb("Berlin") %>%
## build an Overpass query
opq() %>%
## access particular feature
add_osm_feature("craft", "beekeeper")
## download data
sf_beekeepers <- osmdata_sf(beekeeper_query)Now we can investigate beekeepers in Berlin:
## [1] "bbox" "overpass_call" "meta"
## [4] "osm_points" "osm_lines" "osm_polygons"
## [7] "osm_multilines" "osm_multipolygons"
## Simple feature collection with 6 features and 27 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 13.24443 ymin: 52.35861 xmax: 13.69093 ymax: 52.573
## Geodetic CRS: WGS 84
## osm_id name addr:city addr:country addr:housenumber addr:postcode
## 358407135 358407135 <NA> <NA> <NA> <NA> <NA>
## 358407138 358407138 <NA> <NA> <NA> <NA> <NA>
## 417509803 417509803 <NA> <NA> <NA> <NA> <NA>
## 417509805 417509805 <NA> <NA> <NA> <NA> <NA>
## 597668310 597668310 <NA> <NA> <NA> <NA> <NA>
## 597668311 597668311 <NA> <NA> <NA> <NA> <NA>
## addr:street addr:suburb check_date contact:email contact:phone
## 358407135 <NA> <NA> <NA> <NA> <NA>
## 358407138 <NA> <NA> <NA> <NA> <NA>
## 417509803 <NA> <NA> <NA> <NA> <NA>
## 417509805 <NA> <NA> <NA> <NA> <NA>
## 597668310 <NA> <NA> <NA> <NA> <NA>
## 597668311 <NA> <NA> <NA> <NA> <NA>
## contact:website craft email facebook instagram man_made opening_hours
## 358407135 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 358407138 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509803 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509805 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668310 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668311 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## operator organic phone product shop source website wheelchair works
## 358407135 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 358407138 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509803 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509805 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668310 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668311 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## geometry
## 358407135 POINT (13.69068 52.35918)
## 358407138 POINT (13.69093 52.35894)
## 417509803 POINT (13.68991 52.35888)
## 417509805 POINT (13.6902 52.35861)
## 597668310 POINT (13.24445 52.573)
## 597668311 POINT (13.24443 52.57295)
beekeper_locations <- sf_beekeepers$osm_points
gb_ber <- gb_adm1(c("germany"), type = "sscgs")[3,] # the third element is Berlin
ggplot(beekeper_locations) +
geom_sf(data = gb_ber) +
# geom_sf(data = d6berlin::sf_berlin) + ## alternative to geoboundaries, but d6berlin needs to be loaded first
geom_sf(size = 2) +
theme_void(){elevatr}
The {elevatr}
(https://github.com/jhollist/elevatr/) is an R package
that provides access to elevation data from AWS (Amazon Web Services)
Open Data Terrain Tiles and the Open Topography Global data sets API
(application programming interface) for raster digital elevation models
(DEMs).
We first need to define a location or bounding box for our elevation
data. This can either be a data frame or a spatial object. We use an
sf object which holds the projection to be used when
assessing the elevation data:
library(elevatr)
## manually specify corners of the bounding box of the US
bbox_usa <- data.frame(x = c(-125.0011, -66.9326),
y = c(24.9493, 49.5904))
## turn into spatial, projected bounding box
sf_bbox_usa <- st_as_sf(bbox_usa, coords = c("x", "y"), crs = 4326)Now we can download the elevation data with a specified resolution z (ranging from 1 to 14 with 1 being very coarse and 14 being very fine).
Optional - composite plots
Composite plots: plotting several layers and data types
Simple composite plot
Plot the hillshade (3D relief) and add temperature colours on top: alpha value gives semi-transparency. The extent is plotted as a red box, and the centroid coordinates of the raster cells are plotted as points.
plot(Bor_hs, col = grey(0:100/100), legend = FALSE)
plot(ras_bio_asc_24, col = terrain.colors(25, alpha = 0.3), add = TRUE)
points(crds(ras_bio_asc_01_cr), cex = 0.1, pch = '+')
plot(ext(ras_bio_asc_01_cr), add = TRUE, col = 'red')Composite plot with {tmap}
In the following, we will create a SpatialPointDataFrame
from the small clipped/ cropped raster with the {sf}
package and plot it on top of the hillshade. Use the possibility of
selecting/ disregarding different layers (layer icon on the left).
hillsh <- Bor_hs
# make sf object from coordinates of the raster
ras_bio_asc_01_cr_sf <- st_as_sf(
data.frame(crds(ras_bio_asc_01_cr)),
coords = c("x", "y"), ## define columns for the coordinates
crs = 4326, ## define crs, 4326 is the EPSG code
sf_column_name = "geometry" ## sf needs a geometry column and you have to name it
)
# interactive plot
tmap_mode(mode = "view")
tm_shape(shp = hillsh, raster.downsample = TRUE) +
tm_raster(palette = "Greys") +
tm_shape(shp = ras_bio_asc_24, raster.downsample = TRUE) +
tm_raster(palette = grDevices::topo.colors(20),alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01, raster.downsample = TRUE) +
tm_raster(palette = grDevices::rainbow(10), alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01_cr_sf) +
tm_dots(shape = 20, size = 0.01) What do the shape-arguments mean, e.g. shape = 20? Have
a look here at the symbols: https://ggplot2.tidyverse.org/articles/ggplot2-specs.html?q=shape#sec:shape-spec
Unfortunately, in {tmap} you can only use
shape = 1 and shape = 20, only circles are
plotted.
# static plot
tmap_mode(mode = "plot")
tm_shape(shp = hillsh, raster.downsample = TRUE) +
tm_raster(palette = "Greys") +
tm_shape(shp = ras_bio_asc_24, raster.downsample = TRUE) +
tm_raster(palette = grDevices::terrain.colors(10), alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01_cr_sf) +
tm_dots(shape = 1, size = 0.05) An Advanced Map with {tmap}
At the end, an example of a beautiful map, created with
tmap:
tmap_mode(mode = "plot")
tm_shape(shp = hillsh) +
## hillshading
tm_raster(palette = "Greys",
legend.show = FALSE,
alpha = 0.75) +
## topographical model
tm_shape(shp = ras_bio_asc_24) +
tm_raster(palette = terrain.colors(25),
alpha = 0.2,
legend.show = FALSE) +
## rivers
tm_shape(shp = River_shp) +
tm_lines(col = "dodgerblue3") +
## protected areas
tm_shape(shp = PA_shp) +
tm_polygons(border.col = "white",
alpha = 0) +
## locations
tm_shape(shp = recs_sf) +
tm_dots(shape = 16,
size = 0.3) +
tm_shape(shp = insidePA) +
tm_dots(col = "red",
size = 1,
shape = 3) +
## styling
tm_layout(title = "END OF SESSION",
title.color = "darkgrey",
title.position = c(0.05, 0.9),
title.size = 2)End optional.
Recap: the most important functions for spatial data
TL;DR ? (too long didn’t read?). Then at least remember the following most important commands for loading, projecting and saving spatial data.
Load spatial data
Assign CRS to or project spatial data into another CRS
Exercise
**Revisit the data set from course 1 on wild boar observations in
Berlin: data_wb_melden_en.csv. (it is here:
....\d6_teaching_collection\data\data_berlin\animal_data).
Now, we want to study the spatial patterns of the wild boar observations. We may hypothesize that wild boar observations are also related to local differences in weather. Answer the following question using spatial data sets and visualizations:**
- Question 1.1) Were wild boar with many piglets seen more often in warm parts of the city?
- Question 1.2) Were wild boars more often observed at colder spots of the city during sunny weather?
- Question 1.3) Additional question for the fast ones: Were large groups of wild boar more frequently seen in areas providing a dense tree cover?
Follow these steps to answer the questions:
- Load the wild boar data (
data_wb_melden_en.csv) and spatial data on temperature in Berlin (summer_temp_100m_3035.ascindata_berlin). Remember to set the correct CRS to the raster! (Hint: we need the Lambert Azimuthal Equal Area (LAEA) projection for Europe. Tip: always save the CRS as EPSG code to the filename ;-) )
- Load the wild boar data (
- Turn the wild boar data into a simple features object. Remember to set the correct CRS (the wild boar locations were collected in WGS 84) and to transform it to the same CRS as the raster afterwards!
- Inspect both spatial data sets by plotting the temperature raster with the wild boar locations on top using the R basic plot function. The locations should match the map (if not something went wrong when setting the CRS; check if they are the same).
- Select one of the temperature layers and extract the values for each wild boar location.
- Create a plot that visualizes the temperature at each wild boar observation with 3 or more piglets (y axis) for each number of piglets (e.g. make a boxplot and plot the raw data on top).
- Create a new variable that holds the weather category as either “sunny” or “other”.
- Visualize the temperature at each wild boar observation (y axis) for sunny and other weather (box and whisker plot + raw data).
- Save the wild boar observations with the local temperature information as an .Rds file.
For the additional question:
- Load the tree cover data set
(
tree_cover_density_2012_100m_3035.ascindata_berlin).
- Load the tree cover data set
(
- Inspect the data set by plotting the raster using the R basic plot function.
- Extract the tree cover within a buffer of 100m around each wild boar
location (hint: use the help to inspect the arguments of
extract()).
- Extract the tree cover within a buffer of 100m around each wild boar
location (hint: use the help to inspect the arguments of
- Create a boxplot that shows the tree cover (y axis) based on the group size.
Session Info
## R version 4.3.2 (2023-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=C
##
## time zone: Europe/Berlin
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] elevatr_0.99.0 osmdata_0.2.5 rgeoboundaries_1.2.9000
## [4] rnaturalearth_1.0.1.9000 here_1.0.1 units_0.8-5
## [7] patchwork_1.2.0 viridis_0.6.4 viridisLite_0.4.2
## [10] tmap_3.3-4 ggplot2_3.4.4 stars_0.6-4
## [13] abind_1.4-5 sp_2.1-2 sf_1.0-15
## [16] rasterVis_0.51.6 lattice_0.22-5 terra_1.7-65
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.1 deldir_2.0-2 gridExtra_2.3
## [4] httr2_1.0.0 tmaptools_3.1-1 s2_1.1.6
## [7] rlang_1.1.3 magrittr_2.0.3 e1071_1.7-14
## [10] compiler_4.3.2 png_0.1-8 systemfonts_1.0.5
## [13] vctrs_0.6.4 stringr_1.5.1 rvest_1.0.3
## [16] httpcode_0.3.0 crayon_1.5.2 pkgconfig_2.0.3
## [19] wk_0.9.1 fastmap_1.1.1 ellipsis_0.3.2
## [22] lwgeom_0.2-13 rmdformats_1.0.4 leafem_0.2.3
## [25] labeling_0.4.3 utf8_1.2.4 rmarkdown_2.25
## [28] ragg_1.2.6 purrr_1.0.2 xfun_0.41
## [31] cachem_1.0.8 jsonlite_1.8.8 progress_1.2.3
## [34] highr_0.10 jpeg_0.1-10 prettyunits_1.2.0
## [37] parallel_4.3.2 R6_2.5.1 stringi_1.8.2
## [40] bslib_0.6.0 RColorBrewer_1.1-3 hoardr_0.5.4
## [43] lubridate_1.9.3 jquerylib_0.1.4 Rcpp_1.0.12
## [46] bookdown_0.36 knitr_1.45 triebeard_0.4.1
## [49] zoo_1.8-12 base64enc_0.1-3 leaflet.providers_2.0.0
## [52] timechange_0.2.0 rstudioapi_0.15.0 dichromat_2.0-0.1
## [55] yaml_2.3.7 codetools_0.2-19 curl_5.2.0
## [58] tibble_3.2.1 leafsync_0.1.0 withr_2.5.2
## [61] evaluate_0.23 proxy_0.4-27 xml2_1.3.5
## [64] pillar_1.9.0 KernSmooth_2.23-22 generics_0.1.3
## [67] rprojroot_2.0.4 hms_1.1.3 munsell_0.5.0
## [70] scales_1.2.1 class_7.3-22 glue_1.7.0
## [73] tools_4.3.2 interp_1.1-5 hexbin_1.28.3
## [76] XML_3.99-0.16.1 grid_4.3.2 slippymath_0.3.1
## [79] urltools_1.7.3 crosstalk_1.2.1 latticeExtra_0.6-30
## [82] colorspace_2.1-0 raster_3.6-26 cli_3.6.2
## [85] rappdirs_0.3.3 textshaping_0.3.7 fansi_1.0.5
## [88] countrycode_1.5.0 gtable_0.3.4 selectr_0.4-2
## [91] sass_0.4.7 digest_0.6.34 progressr_0.14.0
## [94] classInt_0.4-10 crul_1.4.0 htmlwidgets_1.6.3
## [97] farver_2.1.1 memoise_2.0.1 htmltools_0.5.7
## [100] lifecycle_1.0.4 leaflet_2.2.1 httr_1.4.7
## [103] mime_0.12